-
Notifications
You must be signed in to change notification settings - Fork 427
fix: allow reading pyarrow timestamptz as iceberg timestamp #2708
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: allow reading pyarrow timestamptz as iceberg timestamp #2708
Conversation
|
I'm comfortable with having this patch in the read path, but it looks like we're also implicitly adding this to the write path: iceberg-python/pyiceberg/io/pyarrow.py Line 2576 in e759044
I'm reluctant to add it there since we want to enforce the difference between |
|
thanks! restricting the write side makes sense. Let me take a look at how we can do that |
f5ff73a to
9f637fb
Compare
|
@Fokko Thanks for catching the write path! I added a flag to enable this behavior only for the read path, and added tests to verify. |
geruh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM @kevinjqliu! I was able to test the behavior locally and read path correctly allows the UTC timestamptz to timestamp conversion, while the write path rejects it. Also, the CI shows that all the existing ts tests pass!
Fokko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this @kevinjqliu, and thanks @geruh for the review 🚀
Rationale for this change
Closes #2663
Relates to #2333 which allow reading pyarrow timestamp as iceberg timestamptz
This PR allows PyArrow timestamptz to be read as Iceberg timestamp.
Although this configuration does not conform to the Iceberg spec, the change aligns PyIceberg's behavior with Spark when reading mismatched types (timestamptz as iceberg timestamp and timestamp as iceberg timestamptz)
The write path remains strict and will reject this type mismatch.
Are these changes tested?
Yes
Are there any user-facing changes?
No